Optimizing Data Scheduling on Processor-in-Memory Arrays

Authors

Yi Tian

Edwin Hsing-Mean Sha

Chantana Phongpensri

Peter M. Kogge

Abstract

In the PetaFlop project study, a Processor-In-Memory (PIM) array was proposed as a target architecture for achieving 10^15 floating-point operations per second. However, one of the major obstacles to fast computing is interprocessor communication, which lengthens the total execution time of an application. Good data scheduling, which consists of finding an initial data placement and the data movement at run time, can significantly reduce the total communication cost and the execution time of the application. In this paper, we propose efficient algorithms for the data scheduling problem. Experimental results show the effectiveness of the proposed approaches: compared with default data distribution methods such as row-wise or column-wise distribution, the average improvement for the tested benchmarks can be up to 30%.
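To make the idea of comparing data placements concrete, the following minimal sketch evaluates the communication cost of a row-wise and a column-wise distribution for one fixed access pattern. It is not the algorithm proposed in the paper. The 2x2 processor mesh, the Manhattan-distance (hop count) cost model, the access pattern, and all function names are assumptions made for illustration, and only the initial placement is modeled, not data movement at run time.

```python
from typing import Callable, List, Tuple

Proc = Tuple[int, int]  # (row, col) position of a processor in the mesh
Elem = Tuple[int, int]  # (i, j) index of an element of the data array
Placement = Callable[[Elem], Proc]


def manhattan(a: Proc, b: Proc) -> int:
    """Hop count between two processors on a 2D mesh (assumed cost model)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def row_wise(procs: List[Proc]) -> Placement:
    """Row i of the data array is stored on processor i mod P."""
    return lambda e: procs[e[0] % len(procs)]


def column_wise(procs: List[Proc]) -> Placement:
    """Column j of the data array is stored on processor j mod P."""
    return lambda e: procs[e[1] % len(procs)]


def comm_cost(accesses: List[Tuple[Proc, Elem]], place: Placement) -> int:
    """Total hops when each (processor, element) access fetches the element
    from the processor that the placement assigns it to."""
    return sum(manhattan(p, place(e)) for p, e in accesses)


# Hypothetical setup: a 2x2 mesh and a 4x4 data array.
procs = [(0, 0), (0, 1), (1, 0), (1, 1)]
n = 4

# Hypothetical access pattern: processor j reads every element of column j.
accesses = [(procs[j], (i, j)) for j in range(n) for i in range(n)]

row_cost = comm_cost(accesses, row_wise(procs))
col_cost = comm_cost(accesses, column_wise(procs))
print("row-wise cost:", row_cost)
print("column-wise cost:", col_cost)
```

For this column-oriented access pattern the column-wise placement needs no communication, while the row-wise placement costs 16 hops. A different access pattern would favor a different placement, which is why the placement has to be chosen per application rather than fixed in advance.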

Similar resources

Towards Truly Boolean Arrays in Data-Parallel Array Processing

Booleans are the most basic values in computing. Machines, however, store Booleans in larger compounds such as bytes or integers due to limitations in addressing memory locations. For individual values the relative waste of memory capacity is huge, but the absolute waste is negligible. The latter radically changes if large numbers of Boolean values are processed in (multidimensional) arrays. Mo...

A High Performance Parallel IP Lookup Technique Using Distributed Memory Organization and ISCB-Tree Data Structure

The IP Lookup Process is a key bottleneck in routing due to the increase in routing table size, increasing traffic and migration to IPv6 addresses. The IP address lookup involves computation of the Longest Prefix Matching (LPM), for which existing solutions, such as BSD Radix Tries, scale poorly when traffic in the router increases or when employed for IPv6 address lookups. In this paper, we describe a ...

Interface Synthesis using Memory Mapping for an FPGA Platform

Several system-on-chip (SoC) platforms have recently emerged that use reconfigurable logic (FPGAs) as a programmable co-processor to reduce the computational load on the main processor core. We present an interface synthesis approach that enables us to do hardware-software codesign for such FPGA-based platforms. The approach is based on a novel memory mapping algorithm that maps data used by bo...

A new approach to model communication for mapping and scheduling DSP-applications

We present a novel approach to model inter-processor communication in multi-DSP systems. In most multi-DSP systems, inter-processor communication is realized by transferring data over point-to-point links with hardware FIFO buffers. Direct memory access (DMA) is additionally used to concurrently transfer data to the FIFO buffers and perform computation. Our model accounts for the limited size of ...

Publication date: 1998
